perm filename NETS.DOC[DLN,MRC] blob sn#409595 filedate 1979-01-09 generic text, type T, neo UTF8
[Note from MRC:  People who should get mail on this topic are:
 MRC,JMC,Les,Hedrick@Rutgers,Ryland@Rutgers,Rindfleisch@Sumex,Yeager@Sumex]

This document is a critique of the protocols for DIALNET, and the proposed
Sumex TTYFTP protocol (hereafter referred to simply as the Sumex protocol).
It is based on conversations between Chris Ryland of Columbia University
and Charles Hedrick of Rutgers University.  We got involved in this when
we found that we were each about to become involved in networking projects
with similar goals, but with protocols sufficiently different that we would
not be able to talk to each other.  (Chris was involved in DIALNET and
Charles in Sumex.)  Since it seemed that there should be only one such
protocol, and since there seemed to be some problems with each of the
proposed ones, we decided to propose a single compromise approach.  

The criteria that we have used in evaluating the protocols are as follows:
  -  It should be possible to use them over any of the communications
	media currently in use on 10's, 20's, and 11's.  (Throughout this
	document the term 20 will be taken to include 10's running Tenex.)
	This includes not only Dialup lines, but also Telenet (and other
	commercial packet networks) and probably DECnet.
  -  It should be possible to implement the protocols in the monitor, for
	maximum efficiency and elegance.  But it should also be possible
	to write a single user-mode program to implement a subset of the
	functions.  The design of the protocol should not require a
	particular style of implementation.
  -  It should be possible to write a special purpose FTP program, or
	a general purpose system implementing FTP, mail, and Telnet
	connections.
  -  All of these implementations should be able to talk to each other,
	with the obvious limitations.  (E.g. a specialized FTP server
	will obviously refuse to do anything other than FTP.)
  -  The protocols must be solid, i.e. free of hanging or other
	anomalous behavior, even over noisy lines.
 
We believe that these criteria require that a bit more attention be given
to the way the protocols are broken up into levels.  For example, if one is
using a packet switching network such as Telenet, there is no reason to
impose an extra layer of packet-switching protocol on top of it.  Other
examples of improper hierarchical organization in the protocols will
become apparent shortly.  We propose the following levels:

 0 - error correcting.  This includes framing, checksumming, and the
	various op codes used to ACK and NAK.  This level is only
	needed over bare phone lines.  It is not desirable with
	X.25 and DECnet, which will already have similar code in the
	monitor.
 1 - multiplexing several users (or jobs).  Some implementations will
	be restricted to a single user, and will NAK any attempt to
	refer to a user other than 0.
 2 - multiplexing several channels for a single user.  This level is
	separate from level 1 because some networks (e.g. Telenet,
	Tymnet) already have the concept of multiplexing users, but
	not of individual channels in quite the sense of the Dialnet/
	Sumex proposals.  This is the highest level at which some
	packet structure is visible.  The op codes for EOF, channel
	interrupts, etc., are on this level.
 3 - applications protocols, e.g. FTP

All implementations will require all levels, but some may be done by
the carrier.  E.g. with Telenet the carrier does 0 and 1, and our
program does 2 and 3.  With DECnet, the DECnet hardware and software
do levels 0, 1, and 2, and the program only needs to do level 3.
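
The division of labor just described can be written down as a small
table.  The following Python sketch is only an illustration: the carrier
keys and the function name are ours, and the "dialup" row reflects our
reading that over a bare phone line the program must do everything itself.

```python
# Which protocol levels the carrier supplies, per the text above.
# Keys and names are illustrative, not part of either proposal.
ALL_LEVELS = {0, 1, 2, 3}

CARRIER_LEVELS = {
    "dialup": set(),        # bare phone line: carrier supplies nothing
    "telenet": {0, 1},      # carrier does error correction and users
    "decnet": {0, 1, 2},    # DECnet also supplies channels
}


def program_levels(carrier):
    """Return the sorted list of levels our own program must implement."""
    return sorted(ALL_LEVELS - CARRIER_LEVELS[carrier])
```

For example, program_levels("telenet") gives levels 2 and 3, matching
the Telenet case above.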

The structures at the various levels may be summarized in the
following diagram.  Note that this is only the logical structure.
The actual representation of this data may be somewhat more compact,
as will be discussed later.

 0  frame
    op code
 ------------
 1  user #
 ------------
 2  channel #
    op code
    packet length
 ------------
 3  data, following tertiary protocol
 ------------
 ------------
 ------------
 0  checksum
    frame end

We envision that outer levels would strip data pertaining just to that
level.  For example at level 2 the packet structure is 
    channel #
    op code
    packet length
    data
This is necessary since Telenet, DECnet, etc., will in general have different
outer level protocols than ours.  Indeed our level 2 packets will be
"virtual packets" in this case, since the carrier's packet structure will be
transparent to us.  We will just see a stream of characters from
Telenet or DECnet.
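
To make the idea of "virtual packets" concrete, here is a minimal Python
sketch of how level 2 might carve packets out of the character stream
delivered by a carrier.  It assumes one byte per header field, in the
order listed above, and takes the length byte literally as a count of
data bytes; both are assumptions made only for the illustration.

```python
def split_virtual_packets(stream):
    """Split a byte string from the carrier into level 2 packets.

    Assumed layout (one byte per field): channel #, op code,
    packet length, then that many data bytes.  Returns a list of
    (channel, op, data) tuples plus any trailing incomplete bytes.
    """
    packets = []
    pos = 0
    while len(stream) - pos >= 3:
        channel, op, length = stream[pos], stream[pos + 1], stream[pos + 2]
        end = pos + 3 + length
        if end > len(stream):
            break  # packet not fully arrived yet; wait for more characters
        packets.append((channel, op, bytes(stream[pos + 3:end])))
        pos = end
    return packets, bytes(stream[pos:])
```

The leftover bytes would be kept and prepended to the next batch of
characters read from the carrier.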

Details of the protocols:

Level 0
----- -

The Dialnet and Sumex packet structures are quite similar at this level:
SOP, user #, channel #, op code, packet #, acknowledged packet #,
<data>, checksum, EOP.

 - SOP/EOP.  Sumex chooses sequences that do not cause problems to
	operating systems.  They assume no ↑C's, e.g.  (They use
	an escape sequence to send control characters when they are
	in the data.)  We believe that in the end the software is
	going to want to use 8-bit image I/O.  So this distinction
	does not seem important to us.  The Sumex choice has the
	advantage of making it safer to debug programs, as they do
	not have to run with ↑C being intercepted.  The Dialnet
	choice has the advantage of using characters as specified
	by ANSI.  We can live with either choice.
 - channel #, user # - not part of level 0.  One can envision a
	carrier that does end-to-end error correction but has only
	one logical channel.  In that case we would want to be able
	to have all of the proposed protocol except the error 
	correction.  Hence just the error correction must be on a
	single level.
 - op codes - the proposals both mix op codes involved with error
	correction with those used in other levels.  The op codes
	relevant to level 0 are only
	  Dialnet:  NOP, WIN, MSG, NAK, ERR (?)
  	  Sumex:  ACK, NAK, AER, WIN, WAK, SUS, CON, ONE, RES, USR
	There are several differences hiding here:
	  - Acknowledgement strategies:  Dialnet acknowledges everything,
		exchanging NOP's every 5 seconds if there is nothing
		else to do.  Sumex uses the ATYP field to specify that
		certain messages do not need to be acknowledged.  This
		is needed to break the recursive "ACK loop".  We agree
		with Sumex that it should be possible to shut off
		ACK loops.  This is necessary to implement the SUS
		op code, if nothing else.  However their implementation
		using ATYP mixes functions of level 0 and 2, since ATYP=2
		indicates both "do not acknowledge" (level 0) and
		"end of message" (level 2).  We prefer a simple rule that
		one never needs to ACK an ACK, though one can NAK it.  This
		should suffice to cut off loops.
	  - Sumex has some extra op codes: AER and WAK.  (Their ACK is
		just NOP, and their USR is just MSG.  SUS, CON, ONE, and
		RES will be discussed below.)  As far as we can see,
		they are for the case where erroneous data passes
		the checksum test.  This is a low probability event
		(about p**2/16, where p is the probability of a single
		bit error in a packet, using the Sumex checksum 
		algorithm.  It is even less likely using the Dialnet
		algorithm.)  However without this mechanism the program
		is left with nothing to do in the case where an ACK
		comes back with an error in it.  This could even happen
		due to a software problem.  We do not like protocols
		that provide no recourse in certain cases, even if they
		are unlikely.  Thus we prefer the Sumex approach.  (We
		assume it is not possible to prove the completeness
		property, as is done in their paper, without AER and
		WAK.)
	  - Sumex has some extra functions: suspend (SUS and CON) and
		single packet mode (ONE and RES).  SUS appears to be
		essential for single user programs.  (What do you do
		if he ↑C's the program and then continues it?  With
		the Dialnet protocol, the other end would assume a
		crash had happened.)  Single packet mode could also
		be useful for debugging and working with questionable
		lines.  So we suggest adopting these Sumex ideas.
	In summary, we propose using the Sumex level 0 op codes, with
	the change that there will be no ATYP field, but ACK will
	never be ACK'ed.  (If a NAK is lost, this is no problem, as
	the next transmission will redo the effect of the improperly
	received ACK.)
  - Checksum - Dialnet proposes an algorithm due to Knuth.  Sumex
	proposes one of about the same complexity, but using rotate instead
	of multiply.  They argue that on some machines multiply could be
	costly.  We agree, assuming that there is not some dramatic
	advantage to Knuth's algorithm.  The algorithm could probably
	be improved somewhat without using a multiply by doing two rotates,
	one left and one right, and XOR'ing the results.
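
The acknowledgement rule proposed in the op code summary above (ACK is
never ACK'ed, but may be NAK'ed) can be sketched in a few lines of
Python.  The function name is ours, and the treatment of a received NAK
(answer by retransmitting rather than by an ACK) is our assumption; the
text does not spell it out.

```python
def reply_to(op, checksum_ok):
    """Decide what, if anything, to send back for a received packet.

    op is the level 0 op code name; checksum_ok says whether the
    packet passed the checksum test.
    """
    if not checksum_ok:
        return "NAK"          # any damaged packet, even an ACK, may be NAK'ed
    if op == "ACK":
        return None           # never ACK an ACK; this is what cuts the loop
    if op == "NAK":
        return "RETRANSMIT"   # assumption: a NAK is answered by resending
    return "ACK"              # everything else is acknowledged
```

Since a correctly received ACK produces no reply at all, two idle ends
cannot fall into an endless exchange of acknowledgements.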
  

Level 1
----- -

This level requires only a user number.  Note that Sumex proposes to
limit this to 4 bits, to save space.  This gives 16 users.  That is
certainly sufficient for a 1200 baud line, as no user would get enough
bandwidth to do useful work with more than 16 users.  But if you consider
this protocol being used at high speed between two multi-user systems
(an application important to Columbia), one might need more than 16.
It is safer to allocate a whole byte.

There is one level 1 op code: NOU (no such user).  It indicates that
the previous message was for a non-existent user.  This is needed to
allow a full implementation to talk to a single-user program.  The
single user program would do no multiplexing, and return this code
whenever a packet arrived for a user other than 0.  NAK will not do in
this case because the sender would interpret this as a transmission
error, and retransmit the packet.  The packet containing the NOU code
should probably have the erroneous user number in its user number field.
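
As a minimal sketch, the level 1 check in a single-user program might
look like the following Python fragment.  The function name and the
shape of the return value are ours.

```python
def check_user(user):
    """Level 1 check for a program that serves only user 0.

    Returns None if the packet may go on to level 2, otherwise the
    (op code, user #) pair to send back.
    """
    if user == 0:
        return None
    # NOU rather than NAK: NAK would just provoke a retransmission.
    # The offending user number is echoed in the user number field.
    return ("NOU", user)
```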


Level 2
----- -

This level contains the data needed to implement the individual
channels.  It is the first level that our software will see in such
systems as Telenet.  The required data are

  - channel # - 3 or 4 bits is sufficient for any envisioned application.
  - op code - all op codes involving channel control.  Some possible ones
	are the following:
	  - EOF - end of file on this channel.  This must be done on level
		2, rather than by a signal contained in the data in a
		separate channel as proposed by Sumex.  In a
		full implementation, each channel will be implemented
		as a file by the operating system.  Clearly one wants
		to generate an end of file on the channel that is
		involved, not a signal on some other channel which is
		being read asynchronously on a different jfn.
	  - INT - interrupt the channel.  We are not clear that this
		would be needed, and exactly how it would work.  But it
		would obviously be at this level.
	  - EOR - end of record.  We are not sure that this is needed,
		either.  We believe that for most purposes, the channel
		is most easily taken as a continuous stream of bits.
		Again, if it is needed, it is an op code on this level.
		(DECnet does have this.)
	  - MRL - maximum record length.  Needed only if EOR is implemented.
		It must be done at this level, not in the tertiary
		protocol, as proposed by Sumex, since this level must
		allocate the buffers for building up the records.
	  - MPL - maximum packet length.  Probably not needed.  It might
		be necessary for flow control on systems not having
		XON and XOFF.  (i.e. you might have to use single packet
		mode, with packet length less than or equal to the size
		of the monitor's TTY buffers.)
	  - NOC - no such channel - indicates that a packet has been
		received for a channel that is not open.  In a full
		implementation, the monitor would have nothing to do
		with such a packet, as no jfn is open for that channel.
		It must return an error, and a special code is needed
		(since NAK would just cause retransmission of the same
		packet).
  - data size - one byte holding the length minus 1 (i.e. 0 means 1 and
	255 means 256).  It is obviously most convenient to allow a
	maximum packet size that is a power of 2, i.e. 256.
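
The following Python sketch shows how a level 2 handler might treat EOF
and NOC, together with the data size encoding.  It is a toy: the class
and method names are ours, and we use USR as the op code for ordinary
data, which is our assumption about how level 2 would see data packets.

```python
class Channels:
    """Toy level 2 demultiplexer; op code names follow the text."""

    def __init__(self):
        self.channels = {}   # channel # -> {"data": bytearray, "eof": bool}

    def open_channel(self, channel):
        self.channels[channel] = {"data": bytearray(), "eof": False}

    def deliver(self, channel, op, data=b""):
        """Handle one packet; return an op code to send back, or None."""
        state = self.channels.get(channel)
        if state is None:
            return "NOC"             # NAK would only cause retransmission
        if op == "EOF":
            state["eof"] = True      # end of file on this very channel
        elif op == "USR":
            state["data"] += data    # ordinary data for the channel
        return None


def encode_size(n):
    """Data size byte: 0 means 1 byte ... 255 means 256 bytes."""
    if not 1 <= n <= 256:
        raise ValueError("packet data must be 1 to 256 bytes")
    return n - 1
```

Note that EOF is recorded against the channel it arrived on, which is
the point of making it a level 2 op code.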

Representation of level 0:

It is unclear how important compactness is.  The most obvious
representation is the following. Each line represents a byte.

SOP1
SOP2
packet #
packet # being acknowledged
user #
op code
channel #
data size
 .
 .
 .
checksum 1
checksum 2
EOP1
EOP2

The six header bytes can be reduced to 4 by following the Sumex packing:
  user # (4) | op code (4)
  chan # (3) | packet # (5)
	     | packet # acknow (5)
  data size (8)
The following limitations are involved:
  - limit of 16 users - almost always OK, but possibly not for Columbia.
  - limit of 16 op codes - We would represent the level 2 op codes as
	a single code LV2, and then use one data byte for the real
	op code.  There is no problem with this scheme.
  - limit of 8 channels - OK
  - limit of 32 packets in the window.  With a packet size of 256+8= 264,
	this is about 264 * 32 = 8448 bytes in the window.  At 9.6Kb,
	one gets about 1 byte/msec, i.e. have 8.4 sec. to acknowledge
	in order to keep from having a pause in transmission.  That
	seems safe enough.  However at 50Kb, you would have less than
	2 sec, which might be cutting it close in some networks.
It is unclear to me whether this protocol is ever going to be used on
nationwide 50Kb hookups with many users.  But it is probably wise not
to design the protocol to make it impossible.
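
The packed layout and the window arithmetic above can be checked with a
short Python sketch.  Two things here are assumptions: that the field
written first on each line of the layout occupies the high-order bits,
and that a byte costs about 10 bits on the line (which is what makes
9.6Kb come out near 1 byte/msec).

```python
def pack_header(user, op, chan, pkt, ack, size):
    """Pack the six header fields into the 4-byte layout shown above.

    Assumption: the field written first on each line occupies the
    high-order bits, and the unused 3 bits of byte 3 are zero.
    """
    assert 0 <= user < 16 and 0 <= op < 16
    assert 0 <= chan < 8 and 0 <= pkt < 32 and 0 <= ack < 32
    assert 0 <= size < 256
    return bytes([(user << 4) | op, (chan << 5) | pkt, ack, size])


def unpack_header(header):
    """Inverse of pack_header; returns (user, op, chan, pkt, ack, size)."""
    b0, b1, b2, b3 = header
    return (b0 >> 4, b0 & 0x0F, b1 >> 5, b1 & 0x1F, b2 & 0x1F, b3)


def window_seconds(bits_per_second, packet_bytes=264, window=32,
                   bits_per_byte=10):
    """Time to send a full window, i.e. how long an ACK may take."""
    return packet_bytes * window * bits_per_byte / bits_per_second
```

With these assumptions window_seconds(9600) is between 8 and 9 seconds
and window_seconds(50000) is under 2 seconds, in line with the
estimates given above.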

Representation of inner levels:

If the unpacked representation is used, one might choose to pass
to inner levels only the part of the representation that concerns
them.  For example at level 2 one would probably have a representation
such as
  op code
  channel #
  data size
  data
  EOP1
The reason for EOP1 being present even in error-free data, is to provide
a break character for implementations that activate the program only
once for each packet.  It also provides some redundancy to prevent utter
confusion in case "error-free" data turns out not to be totally error free.
Op code would either be USR or a level 2 op code.  Even if the packed
representation is used for transmission, processing would be made somewhat
easier if it were stored in memory in the unpacked form.  (Also, on
the PDP-10, the 6 bytes of the header would probably be put in 2 words,
with the data starting in a third word.)


Level 3
----- -

We prefer the Dialnet assumption that there will be several different
servers using different tertiary protocols, e.g. FTP, mail, and telnet.
If one only chooses to implement FTP, one need only return an error
code if the user tries to select some other server.

Also, we prefer the Dialnet FTP protocol.  It is more extensible.
Again, in the initial version, not everything needs to be implemented.
But one does not want a design that will be hard to extend.  We believe
that the Sumex proposal is a reasonable set of user commands.  But
they should be implemented internally with the Dialnet protocol.  For
the implementation envisioned by Sumex, USER, PASSWORD, etc., will not
be needed.  And a reduced set of DATA options will be available.
Probably
  (data 8 ascii file)  20 to 11 of TEXT
  (data 8 image file)  11 to 11 and 20 to 11 of DATA
  (data 36 image file)  all 20 to 20
It seems natural enough to implement the Sumex commands with the
Dialnet protocols.  E.g. 
  RETRIEVE X.Y TEXT
would result in
  ;assume Sumex is talking to an 11
  ;no access commands are required in the initial version
  -->  (DATA 36 IMAGE FILE)   ;try first to get a 20 to 20 link
  <--  (FAILED (BYTE 36 ILLEGAL))   ;not on a PDP11
  -->  (DATA 8 ASCII FILE)    ;try universal default
  <--  (OK)	;all servers must accept 8 bit ascii
  -->  (RETRIEVE (X.Y))
  <--  (OK (Look out!  Here comes [HEDRICK]X.Y;3))
  <--  (DONE (Transfer completed))
User now types Bye
  -->  (BYE)
  <--  (OK (CONNECTION CLOSED))
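
The data type negotiation in this exchange can be sketched in Python as
follows.  The `send` argument is a hypothetical function that transmits
one command and returns the reply as a string, and `fake_pdp11` is a
stand-in server written only for the illustration.

```python
def negotiate_data_type(send):
    """Pick a data type, trying the 20-to-20 form before the default.

    `send` is a hypothetical function that transmits one command
    string and returns the server's reply string.
    """
    for command in ("(DATA 36 IMAGE FILE)", "(DATA 8 ASCII FILE)"):
        reply = send(command)
        if reply.startswith("(OK"):
            return command
    raise RuntimeError("server refused even 8 bit ascii")


def fake_pdp11(command):
    """Stand-in for an 11 server: it cannot handle 36 bit bytes."""
    if command == "(DATA 36 IMAGE FILE)":
        return "(FAILED (BYTE 36 ILLEGAL))"
    return "(OK)"
```

Against fake_pdp11 the negotiation settles on (DATA 8 ASCII FILE), as in
the exchange shown above.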


We still have to specify how connection to a given server is done.
Dialnet makes it a level 0 op code.  That is not a good idea, since
in the case of a commercial packet-switching network or DECnet, level
0 may not be under our control.  Sumex doesn't allow one to choose
servers at all.  We propose the following:

  - When an initial connection is made to a remote host, channel 0
	of all users is logically connected to an ICP handler.
  - The only legal command to the ICP handler is
	(ICP <server-name>).  This request is at the level of a
	tertiary protocol.  Such a request may come over channel 0
	of any user, and results in connection of that user to the
	requested server.  
  - When a server is finished, the user is reconnected to the ICP
	handler.
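
A minimal Python sketch of such an ICP handler, for an implementation
that offers only FTP, might be the following.  The server table, the
function name, and the text of the (FAILED ...) replies are our own
guesses, the last by analogy with the FTP replies; only the (ICP
<server-name>) command itself comes from the proposal.

```python
import re

SERVERS = {"FTP"}   # an FTP-only implementation; names are illustrative


def icp_handler(command):
    """Handle one command arriving on channel 0 of an idle user.

    Returns (reply, server) where server is the name of the server the
    user is now connected to, or None if still at the ICP handler.
    The (FAILED ...) reply text is our guess, by analogy with FTP.
    """
    match = re.fullmatch(r"\(ICP\s+([A-Za-z0-9]+)\)", command.strip())
    if match is None:
        return "(FAILED (ONLY ICP IS LEGAL HERE))", None
    name = match.group(1).upper()
    if name not in SERVERS:
        return "(FAILED (NO SUCH SERVER))", None
    return "(OK)", name
```

In a fuller implementation the positive reply would more likely come
from the server itself once it has started.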


Implementation notes:

Implementation 1 - super fancy

 - Assume that everything is built into the monitor.  To open a channel,
	one opens device DLN:host.channel.user, where host is a host name
	for Telenet, or a phone number for a direct dialled connection.
 - A line is opened to that host.
 - Once a line is open, connections are initiated on channel 0 of any
	user.  Whenever a message is received over channel 0 of user N,
	and no channel of user N is already open, the monitor CRJOB's
	a job with ICPSER in it.  It then assigns user N to that job.
	(There is some question how that assignment should be done.
	Ideally the monitor should simply remember the correspondence,
	though an IPCF message could be sent to the job telling it
	what user to use.)
 - ICPSER now opens file DLN:*.0.  This gives access to channel 0 of
	whatever host and user happens to be accessible to it.  It
	will now receive a message such as (ICP FTP) over that channel.
 - ICPSER now closes DLN: and starts the server, in this case FTPSER,
	on a subfork.  Actually there is no reason it can't simply
	run the server.  FTPSER needs 3 channels, so it opens
	DLN:*.0 to DLN:*.2.  If they are successfully opened, it
	sends (OK (Rutgers/LCSR FTP server 1.23 5-Jly-79)) on
	channel 0.  (This is seen by the originating host as the
	positive reply to its message (ICP FTP)).  
 - When FTPSER is finished, it closes the 3 channels and kills itself.
	Possibly ICPSER now reopens channel 0, but it is probably
	easiest not to return to ICPSER, since the monitor will restart
	it when the next message is received on channel 0.

Implementation 2 - super dumb

All we have is a user mode program, TTYFTP.

 - The user opens a TTY connection, logs in, and starts TTYFTP.  TTYFTP
	just reads from the controlling TTY and expects that to be
	data following the protocol.
 - TTYFTP implements levels 0 through 3.  If it ever receives a message
	for a user other than 0, it returns NOU.  I.e. only one user
	at a time can use this program.
 - The top level loop expects to read a message (ICP FTP).  I.e.
	it only implements FTP.  The FTP "server" is simply the
	rest of the program.
The trick is to allow these implementations to talk to each other.  The
main problem is with initiating the connection.  Once it is done, all
implementations talk the same protocol.  Of course simple ones may not
implement all of the options, returning error codes when unimplemented
features are used.  There seem to be three main ways to start the
protocol handler:

 - a dedicated port - just dial it up (or open the line if it is hardwired).
	The port is always talking to the protocol handler.
 - a command built into the EXEC.  You would connect to any dialup port
	and type a special command instead of logging in.  This would
	start up the program directly.  Presumably such an implementation
	would require the (USER) command in FTP, since otherwise there is
	no way to validate file access.
 - log in and run a program

The following approaches could be used to handle these possibilities:

 - a simple implementation would have the user dial up a tty line.
	It would just cross-patch the user's tty to the line.
	He would then log in at the remote host and do whatever is
	necessary to start things.  When he had started the protocol
	handler, he would type some special control character
	which would break the link from his tty to the line and begin
	the program proper.  E.g.
	  @TTYFTP
	  What TTY?  TTY32:
	  TTY32: assigned.  Please dial it.  Type CR when done.
	  [You are now connected to the remote host.  Please start
	   their server, type ↑Z when done.]
	  ↑C
	  RUTGERS/LCSR Tops-20, version 3
	  @LOGIN HEDRICK
	  Job 32 logged in at 2300 4-Jly-79
	  @TTYSRV
	  ↑Z
	  [Returning to SUMEX]
	  [Connection open to remote host]
	  TTYFTP>  RETRIEVE X.Y TEXT
 - A system that implemented the net in the monitor might use the
	strategy just described to make the connection.  However after the
	connection is open, the person who made it would supply the
	connection program with a name.  It would then do a special
	MTOPR to inform the monitor that the line is now available
	for use as a network connection to the specified host.  Any
	user could then open it as something like DLN:RUTGERS.0,
	assuming RUTGERS was the name supplied when the connection was
	made.
 - Finally, one could maintain a list of known sites, with the
	appropriate login-startup sequence for each.  Then when you
	open DLN:RUTGERS.0, the autodialer would dial it, and the
	login sequence would happen automatically.


Notes on a Tops-20 implementation.

We believe that the initial implementation should be done as a user
mode program that allows only FTP transfers for one user at a time.  Such
an implementation should always be maintained to allow communication
with sites that are not prepared (or able) to put up an implementation
requiring monitor changes.

As for the need for monitor changes, there appear to be the
following sources of potential bottlenecks:
 - the front end.  We have verified that it is possible to send 9600
	baud data into the front end without dropping data.  It is
	not clear whether XON/XOFF was necessary for that.  (At 4800
	baud XON/XOFF was not needed.  At 9600 the results are not
	clear.)  These tests were done on a lightly loaded machine,
	but I believe that a small number of 9600 baud lines can be
	handled even on a more heavily loaded machine, if the front
	end is not near saturation already (e.g. all 128 possible
	lines in use).
 - the monitor tty code.  This is a more serious source of bottlenecks.
	I believe that the data will get into the big buffer OK, but
	each individual TTY line has a buffer of somewhere between
	50 and 120 characters (the documentation says 50, experiment
	strongly suggests 120).  So if data is coming in at 9600
	baud, the program must be activated and empty the input buffer
	every 50 - 120 msec, if data is to go at full speed.  If this
	does not happen, the XOFF/XON protocol should prevent loss
	of data, but the effective transfer rate will slow down.  There
	have been reports that XOFF/XON does not function properly in
	this mode, but we have no way to evaluate them.  DEC assures
	us that there is no problem, but that it will be fixed in
	the next monitor release.  The situation may be improved if
	one reads the data with RDTTY or TEXTI, since the program
	itself does not need to be activated that often.  For example,
	one could do TEXTI, specifying as the break characters EOP1 and
	EOP2.  Then the program is only activated once for each
	packet.  One would prefer double buffering, but there does not
	seem to be any way to do that.   (Note: if this strategy
	is to be used, we cannot allow EOP1 or EOP2 in the data of
	the packet.  An escape coding would be needed, and the escape
	code itself could not involve EOP1 or EOP2.)
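
One way to do the escape coding mentioned in the note above is sketched
below in Python.  All three byte values and the XOR mask are
hypothetical, chosen only so that the example works; neither proposal
fixes them here.  The technique is ordinary byte stuffing: a reserved
byte is sent as ESC followed by a modified copy of itself.

```python
# All three byte values are hypothetical; they are chosen for the example.
EOP1, EOP2, ESC = 0x1E, 0x1F, 0x1B
FLIP = 0x40   # XOR mask applied to an escaped byte


def escape(data):
    """Encode data so that EOP1 and EOP2 never appear in the output."""
    out = bytearray()
    for b in data:
        if b in (EOP1, EOP2, ESC):
            out += bytes([ESC, b ^ FLIP])   # ESC followed by a safe byte
        else:
            out.append(b)
    return bytes(out)


def unescape(data):
    """Inverse of escape()."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            out.append(data[i + 1] ^ FLIP)  # restore the original byte
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

With this coding a TEXTI that breaks on EOP1 or EOP2 can only be
activated by a real end of packet, never by packet data.
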
It is not clear at this point exactly how much of a performance
problem there is going to be.  It is our guess that there will be
little if any problem at 1200 baud, and the initial use will probably
be at that speed.  At 9600 baud, we hope that XOFF/XON will protect
things.  If monitor changes are needed, one possibility is to try
to increase the size of the line buffer for lines being used in this
way.  Possibly an MTOPR could be used to specify use of extended
buffers.  (It might even make sense to use a page in the user's
address space for the buffers, and completely bypass emptying the
buffer character by character.)  Alternatively, one might build the
protocol into the monitor.  We think it may well prove unnecessary
to modify the front end code to do this.  Indeed it is desirable to
allow various lines to be changed back and forth between normal use
and network use.  The most convenient way to do that is to use the normal
front end code in both cases, and begin special processing at the point
where characters are removed from the big buffer to be transferred to
line buffers.  It would be most natural to allow a user (possibly
privileged) to do an MTOPR to specify that a given line is to be used
for net purposes.
 
It appears that DEC is going to be making changes to support buffered
terminals.  These changes may allow precisely what we need:  efficient
transfers of large blocks of data on tty lines.  Indeed some such changes
are scheduled to be present in the next monitor release.